DETAILED ACTION
Remarks 

1. In response to Applicant's Amendment filed on December 29, 2008, claims 1-20 are
pending in the application. 

Claim Rejections - 35 USC §103 

2. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in 
section 102 of this title, if the differences between the subject matter sought to be patented and the prior art are 
such that the subject matter as a whole would have been obvious at the time the invention was made to a person 
having ordinary skill in the art to which said subject matter pertains. Patentability shall not be negatived by the 
manner in which the invention was made. 

3. Claims 1, 4-8, 9, and 12-16 are rejected under 35 U.S.C. 103(a) as being unpatentable
over Copperman et al. (U.S. Patent No. 6,711,585 B1) in view of Adamske et al. (U.S. Patent
No. 6,615,234 B1).

As to claims 1 and 9, Copperman et al. discloses an apparatus for automatically
extracting metadata from electronic documents comprising a first processing element, a second 
processing element, a reasoning element, and a database, wherein, 

i) said first processing element is further configured to convert electronic documents 
into files (See column 12, lines 62-67, also see abstract); 

ii) said first processing element is configured to provide the files to a second 
processing element (See column 18, lines 7-10, and see column 18, lines 44-48);

iii) said second processing element is configured to receive said files and extract
predetermined information (See column 2, lines 21-24); 

iv) said second processing element is further configured to provide said extracted 
predetermined information to said reasoning element (See column 2, lines 25-31); 

v) said database is configured to also provide input to said reasoning element (See 
column 13, lines 1-10); 

vi) said reasoning element is configured to employ a set of rules to automatically 
extract metadata from the files by employing the extracted predetermined information and the 
input from the database (See column 12, lines 45-51, also see column 13, lines 37-63, wherein 
"reasoning element" is the processing and analysis done by the "autocontextualization"); and 

vii) said reasoning element provides an output of metadata (See Figure 22).

Copperman et al. discloses the claimed invention, teaching that the files are substantially
format-invariant data files (See column 12, lines 62-67), but is not directed to the conversion of
electronic documents into PostScript files.

Adamske et al. teaches the conversion of electronic documents into PostScript files (See 
Adamske et al. Abstract, and see Adamske et al. column 5, lines 34-46). 

It would have been obvious to one of ordinary skill in the art at the time the invention 
was made to have modified Copperman et al. by the teaching of Adamske et al. to include the 
conversion of electronic documents into PostScript files because it allows for portability and 
accessibility across communication networks and machines (ease of printing).

As to claims 4 and 12, Copperman et al. as modified discloses wherein the second
processing element and said database simultaneously input to the reasoning element (See 
Copperman et al. Figure 5, 510, Copperman et al. column 9, lines 38-58). 

As to claims 5 and 13, Copperman et al. as modified discloses wherein said set of rules is
updated (See Copperman et al. column 16, lines 1-13). 

As to claims 6 and 14, Copperman et al. as modified discloses wherein said metadata is
substantially comprised of title, author, affiliation, author affiliation, and table of contents (See 
Figure 2, and see column 13, lines 41-50). 

As to claims 7 and 15, Copperman et al. as modified discloses wherein said metadata is
provided to a user interface (See Copperman et al. Figure 21). 

As to claims 8 and 16, Copperman et al. as modified discloses wherein said metadata is
provided to a storage medium (See Copperman et al. Figure 2, also see Copperman et al. column 
18, lines 44-48). 



3. Claims 3, 11, 17, and 19 are rejected under 35 U.S.C. 103(a) as being unpatentable over
Copperman et al. (U.S. Patent No. 6,711,585 B1) in view of Adamske et al. (U.S. Patent No.
6,615,234 B1), and further in view of Mahoney et al. (U.S. Patent No. 5,999,664).

As to claims 3 and 11, Copperman et al. teaches the claimed invention but does not
explicitly teach wherein said predetermined information is substantially spatial layout facts. 
However, Copperman et al. teaches maintaining and considering topical distance relationships as 
well as boundaries of the original document (See Copperman et al. column 16, lines 41-59). 

Copperman et al. still does not teach the spatial layout facts being augmented strings of 
text, where each spatial layout fact contains a string of text and spatial data regarding the string 
of text. 

Mahoney et al. teaches extracting spatial layout facts (See Mahoney et al. Abstract), and
the spatial layout facts being augmented strings of text, where each spatial layout fact 
contains a string of text and spatial data regarding the string of text (See Mahoney et al. column 
4, lines 47-64, and see Mahoney et al. column 8, lines 25-38, and Mahoney et al. column 10, 
lines 49-67). 

It would have been obvious to one of ordinary skill in the art at the time the invention 
was made to have further modified Copperman et al. as modified by the teaching of Mahoney et 
al. to include wherein said predetermined information is substantially spatial layout facts because 
it provides for accurate document presentation to the users (See Mahoney et al. column 2, lines 
30-36). 

As to claims 17 and 19, Copperman et al. as modified still does not teach wherein each
string of text is bound by a bounding box and wherein the spatial data includes: 

a) a page number of the electronic document where the string of text
appears;

b) an absolute line counter order for each string of text;

c) an x-y location of a lower left corner of a bounding box bounding
the string of text;

d) an x-y location of an upper right corner of the bounding box; and

e) font metrics of bounding box extensions used to represent the string of text.
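
For illustration only, the spatial data recited in items a) through e) can be pictured as a single record attached to each string of text. The sketch below is not taken from the application or from any cited reference; the class name `SpatialLayoutFact`, its field names, and the use of a single `font_size` value as a stand-in for the recited font metrics are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SpatialLayoutFact:
    """Hypothetical record pairing a string of text with its spatial data."""

    text: str                        # the string of text itself
    page: int                        # a) page number where the string appears
    line_order: int                  # b) absolute line counter order
    lower_left: Tuple[float, float]  # c) x-y of the bounding box lower left corner
    upper_right: Tuple[float, float] # d) x-y of the bounding box upper right corner
    font_size: float                 # e) simplified stand-in for font metrics

    def height(self) -> float:
        """Vertical extent of the bounding box."""
        return self.upper_right[1] - self.lower_left[1]


# Example values are invented for illustration.
fact = SpatialLayoutFact(
    text="Automatic Metadata Extraction",
    page=1,
    line_order=0,
    lower_left=(72.0, 700.0),
    upper_right=(400.0, 724.0),
    font_size=24.0,
)
```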

Mahoney et al. teaches wherein each string of text is bound by a bounding box and 
wherein the spatial data includes: 

a) a page number of the electronic document where the string of text
appears (See , wherein it is inherent that pages are numbered for ease of locating);

b) an absolute line counter order for each string of text (See column 10, Table 1); 

c) an x-y location of a lower left corner of a bounding box bounding the string of text
(See column 11, Table 2, wherein the location can be defined as desired);

d) an x-y location of an upper right corner of the bounding box (See column 11, Table 2);
and

e) font metrics of bounding box extensions used to represent the string of text (See
column 27, lines 55-65).

It would have been obvious to one of ordinary skill in the art at the time the invention 
was made to have further modified Copperman et al. as modified by the teaching of Mahoney et 
al. to include specified text bounding boxes because it is a well-known method of extracting and
ordering structure layout for document presentation for consistency and accuracy.

4. Claims 18 and 20 are rejected under 35 U.S.C. 103(a) as being unpatentable over
Copperman et al. (U.S. Patent No. 6,711,585 B1) in view of Adamske et al. (U.S. Patent No.
6,615,234 B1), and further in view of DX Le & GR Thoma, "Automated document labeling using
integrated image and neural processing," published in 1999 (hereinafter Le et al.).

As to claims 18 and 20, Copperman et al. as modified teaches wherein the document
includes at least a first page and text with a small font and other text with a large font.

Copperman et al. as modified does not teach wherein the set of rules include:

a) a rule for extracting a title of the document, such that the title is identified as being 
located on an upper portion of the first page of the document and is printed using the large font; 

b) a rule for extracting authors of the document, such that authors are 
identified as being listed immediately under the title in some order; 

c) a rule for extracting author affiliations, such that author affiliations are identified as 
being located as text following the authors; and 

d) a rule for affiliating the authors with the author affiliations such that if only one 
affiliation appears, then all authors are associated with the one affiliation. 
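
For illustration only, the four rules recited in items a) through d) can be sketched as a simple rule-based pass over the lines of the first page. This is not the implementation of the application or of Le et al.; the `Line` type, the function name `extract_metadata`, the treatment of the first half of the page as its "upper portion", the comma-separated author line, and the single-line affiliation are all assumptions made for the example.

```python
from typing import Dict, List, NamedTuple


class Line(NamedTuple):
    """Hypothetical line of text with the layout data the rules need."""

    text: str
    page: int
    order: int        # absolute line order within the document
    font_size: float


def extract_metadata(lines: List[Line]) -> Dict[str, object]:
    first_page = sorted((l for l in lines if l.page == 1), key=lambda l: l.order)
    if not first_page:
        return {"title": None, "authors": [], "author_affiliations": {}}

    # Rule a: the title is the largest-font line in the upper portion of page 1.
    # Assumption: "upper portion" means the first half of the lines on the page.
    upper = first_page[: max(1, len(first_page) // 2)]
    title_line = max(upper, key=lambda l: l.font_size)

    # Rule b: authors are listed immediately under the title.
    # Assumption: they share one comma-separated line.
    below = [l for l in first_page if l.order > title_line.order]
    authors = [a.strip() for a in below[0].text.split(",") if a.strip()] if below else []

    # Rule c: affiliations are the text following the authors.
    # Assumption: only the next line is treated as an affiliation.
    affiliations = [below[1].text] if len(below) > 1 else []

    # Rule d: if only one affiliation appears, every author is associated with it.
    mapping = {a: affiliations[0] for a in authors} if len(affiliations) == 1 else {}

    return {"title": title_line.text, "authors": authors, "author_affiliations": mapping}


# Example input is invented for illustration.
sample = [
    Line("Automated Labeling", 1, 0, 18.0),
    Line("A. Author, B. Author", 1, 1, 11.0),
    Line("Example University", 1, 2, 10.0),
    Line("Body text begins here.", 1, 3, 10.0),
]
result = extract_metadata(sample)
```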

Le et al. teaches wherein the set of rules include: 

a) a rule for extracting a title of the document, such that the title is identified as being 
located on an upper portion of the first page of the document and is printed using the large font 
(See section 3, wherein it is inherent that the title is in larger font to be visually noticeable);

b) a rule for extracting authors of the document, such that authors are
identified as being listed immediately under the title in some order (See section 3); 

c) a rule for extracting author affiliations, such that author affiliations are identified as 
being located as text following the authors (See section 3); and 

d) a rule for affiliating the authors with the author affiliations such that if only one 
affiliation appears, then all authors are associated with the one affiliation (See section 3, wherein 
it is inherent that multiple authors collaborate together typically from a single affiliation). 

It would have been obvious to one of ordinary skill in the art at the time the invention 
was made to have further modified Copperman et al. as modified by the teaching of Le et al. to 
include defining specific rules for the layout, since not only is it well known in the art that
rules are user defined, but documents showing title, author, and affiliation in that order have
become well-known standards for technical publications on the internet.

Response to Arguments 

5. Applicant's arguments with respect to the claims have been considered but are moot in 
view of the new ground(s) of rejection. 

Conclusion 

6. Applicant's amendment necessitated the new ground(s) of rejection presented in this 
Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). 
Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE
MONTHS from the mailing date of this action. In the event a first reply is filed within TWO 
MONTHS of the mailing date of this final action and the advisory action is not mailed until after 
the end of the THREE-MONTH shortened statutory period, then the shortened statutory period 
will expire on the date the advisory action is mailed, and any extension fee pursuant to 37
CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event,
however, will the statutory period for reply expire later than SIX MONTHS from the date of this 
final action. 

7. The prior art made of record and not relied upon is considered pertinent to applicant's 
disclosure. For a complete list of cited relevant prior art, see PTO-Form 892.

8. Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Neveen Abel-Jalil whose telephone number is 571-272-4074. 
The examiner can normally be reached on 8:30AM-5:30PM EST. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's
supervisor, Christian Chace, can be reached on 571-272-4190. The fax phone number for the
organization where this application or proceeding is assigned is 571-273-8300. 



Application/Control Number: 09/835,064 Page 10 

Art Unit: 2165 

Information regarding the status of an application may be obtained from the Patent 
Application Information Retrieval (PAIR) system. Status information for published applications 
may be obtained from either Private PAIR or Public PAIR. Status information for unpublished 
applications is available through Private PAIR only. For more information about the PAIR 
system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR 
system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would 
like assistance from a USPTO Customer Service Representative or access to the automated 
information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 

Neveen Abel-Jalil 
Primary Examiner 
April 7, 2009 

/Neveen Abel-Jalil/ 

Primary Examiner, Art Unit 2165 



